09. Adversarial & Cycle Consistency Losses
LSGANs
Least squares loss can partly address the vanishing gradient problem when training deep GANs. The problem is as follows: with the standard negative log-likelihood (sigmoid cross-entropy) loss, once an input x is far on the correct side of the decision boundary, the loss saturates and its gradient approaches zero, providing almost no useful signal for training. With a squared loss term, however, the loss — and therefore the gradient — keeps growing the farther x is from its target, as shown below.
Loss patterns for large x values. Image from the LSGAN paper.
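Below is a minimal sketch (not from the original post) of how the two losses might be written in PyTorch. The function and variable names are illustrative; the discriminator is assumed to output raw, unactivated scores.

```python
import torch
import torch.nn.functional as F

def gan_d_loss_bce(d_real_logits, d_fake_logits):
    """Standard GAN discriminator loss: -log D(x) - log(1 - D(G(z))).
    Saturates (near-zero gradient) once the logits are confidently correct."""
    real_loss = F.binary_cross_entropy_with_logits(
        d_real_logits, torch.ones_like(d_real_logits))
    fake_loss = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.zeros_like(d_fake_logits))
    return real_loss + fake_loss

def lsgan_d_loss(d_real_scores, d_fake_scores):
    """LSGAN discriminator loss: 0.5 * [(D(x) - 1)^2 + D(G(z))^2]."""
    real_loss = F.mse_loss(d_real_scores, torch.ones_like(d_real_scores))
    fake_loss = F.mse_loss(d_fake_scores, torch.zeros_like(d_fake_scores))
    return 0.5 * (real_loss + fake_loss)

def lsgan_g_loss(d_fake_scores):
    """LSGAN generator loss: 0.5 * (D(G(z)) - 1)^2.
    The gradient grows with the distance from the target label
    instead of vanishing as in the sigmoid cross-entropy loss."""
    return 0.5 * F.mse_loss(d_fake_scores, torch.ones_like(d_fake_scores))
```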
Least squares loss is just one variant of a GAN loss. There are many more, such as the Wasserstein GAN loss. These variants can sometimes help stabilize training and produce better results. As you write your own code, you're encouraged to hypothesize, try out different loss functions, and see which works best in your case — one way to structure such experiments is sketched below.
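The snippet below is a hedged sketch (not from the original post) of one way to swap loss variants behind a single interface while experimenting. The `mode` names and function signatures are illustrative assumptions, and the WGAN case would additionally require a Lipschitz constraint on the critic (e.g. weight clipping or a gradient penalty), which is not shown here.

```python
import torch

def d_loss(d_real, d_fake, mode="lsgan"):
    """Discriminator/critic loss for a chosen variant.
    `d_real` and `d_fake` are raw discriminator scores (tensors)."""
    if mode == "lsgan":
        # Least squares: push real scores toward 1, fake scores toward 0.
        return 0.5 * ((d_real - 1).pow(2).mean() + d_fake.pow(2).mean())
    if mode == "wgan":
        # Wasserstein critic loss: maximize the score gap between real and fake
        # (requires a Lipschitz constraint on the critic, not shown here).
        return d_fake.mean() - d_real.mean()
    raise ValueError(f"unknown mode: {mode}")

def g_loss(d_fake, mode="lsgan"):
    """Generator loss for the matching variant."""
    if mode == "lsgan":
        return 0.5 * (d_fake - 1).pow(2).mean()
    if mode == "wgan":
        return -d_fake.mean()
    raise ValueError(f"unknown mode: {mode}")
```

Keeping the variants behind one pair of functions like this makes it easy to A/B different losses on the same training loop and judge empirically which stabilizes training best for your data.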